Scalable Sequential Rough Parallel Bounded Symmetrical Clustering for Gene Expression Profile Analysis
Abstract
Gene expression profiling of tissues and cells has become a major tool for discovery in medicine. Identifying co-expressed genes and coherent patterns is the central goal of gene expression profiling and an important task in bioinformatics research. Clustering is an important unsupervised learning technique for gene expression profile analysis, and many conventional clustering algorithms have been adapted or directly applied to gene expression data. Among them, Rough Point Symmetry (RoughPsym) and Rough Symmetry (Roughsym) based clustering have been applied to recognize symmetrical patterns in gene expression profiles. Rough-set theory helps with faster convergence and an automatic, optimal initial classification, which addresses the problem that the number of clusters in microarray data is not known in advance. On larger datasets, however, the RoughPsym and Roughsym methods do not achieve good efficiency or accuracy. To address this problem and to enable clustering of large microarray data, this article applies a distributed, time-efficient, scalable Sequential Rough Parallel Bounded Symmetrical clustering (SeqRoughPBSym) to the rough-set-based approach.
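The abstract does not say which symmetry measure SeqRoughPBSym uses. As a rough illustration of what "point symmetry" means for clustering, the sketch below implements one common normalized point-symmetry distance: a point is close to a cluster center when some other point lies near its mirror image about that center. The function name, the pure-Python representation of expression profiles as tuples, and the choice of this particular formulation are all assumptions made for illustration, not the authors' method.

```python
import math


def point_symmetry_distance(j, points, center):
    """Hypothetical helper: normalized point-symmetry distance of points[j] to center.

    For every other point i it computes
        ||(x_j - c) + (x_i - c)|| / (||x_j - c|| + ||x_i - c||)
    and returns the minimum. A value of 0 means some other point is the exact
    mirror image of points[j] about the center.
    """
    xj = [a - c for a, c in zip(points[j], center)]
    norm_j = math.hypot(*xj)
    best = math.inf
    for i, p in enumerate(points):
        if i == j:
            continue
        xi = [a - c for a, c in zip(p, center)]
        denom = norm_j + math.hypot(*xi)
        if denom == 0.0:
            # Both points coincide with the center: treat as perfectly symmetric.
            ratio = 0.0
        else:
            ratio = math.hypot(*[a + b for a, b in zip(xj, xi)]) / denom
        best = min(best, ratio)
    return best


# Toy 2-D "profiles": the first two mirror each other about the origin.
profiles = [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)]
d_mirrored = point_symmetry_distance(0, profiles, (0.0, 0.0))
d_unmatched = point_symmetry_distance(2, profiles, (0.0, 0.0))
```

In a K-Means style loop, a measure like this would stand in for (or be combined with) the Euclidean distance when assigning each profile to a center. This version scans all other points for every query, so it costs O(n) per evaluation, which is the kind of cost that motivates distributed or parallel variants on large microarray data.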
Similar resources
Gene microarray data analysis using parallel point-symmetry-based clustering
Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed time-efficient scalable approach for point-symmetry-based K-Means algorithm....
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
FPF-SB : A Scalable Algorithm for Microarray Gene Expression Data Clustering
Efficient and effective analysis of large datasets from microarray gene expression data is one of the keys to time-critical personalized medicine. The issue we address here is the scalability of the data processing software for clustering gene expression data into groups with homogeneous expression profile. In this paper we propose FPF-SB, a novel clustering algorithm based on a combination of ...
Proposal for Developing an Approach to Constraints Based Multi-dimensional Data Clustering Aided with Associative Clustering Using Comparative Study
Our paper outlines a proposal for developing an approach to constraints based multi-dimensional data clustering aided with associative clustering. Our proposed approach evolved from comparative study of associative clustering, gene expression, multi-dimensional large data sets in a distributed environment from state of the art. Our proposed approach will work well in many of the emerging distri...
Scalable Problems and Memory-bounded Speedup
In this paper three models of parallel speedup are studied. They are fixed-size speedup, fixed-time speedup and memory-bounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set considers uneven workload allocation and communication overhead, and gives more accurate estimation. Anoth...
Publication date: 2016